Dutch patent NL2023815A: Numerical simulation method for unstructured grid tides and tidal currents based on GPU computation

Abstract:
There is provided a numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, comprising: sending grid information and physical field information to GPU memory; computing the internal grid cell face flux and the volume source item by a first kernel function, using a grid cell as the computational cell corresponding to a GPU thread; computing the boundary grid cell face flux by a second kernel function, using a grid cell face as the computational cell corresponding to a GPU thread; performing the time advance computation by a third kernel function, using a cell as the computational cell corresponding to a GPU thread; and returning the result from the GPU to the CPU; wherein multiple GPUs are used for parallel computation of the internal and boundary grid cell face flux or the time advance. By separating the flux computation of internal and boundary grid cell faces, the invention overcomes the low computational efficiency caused by simply using either a grid cell or a grid cell face as the computational cell.
Publication number: NL2023815A
Application number: NL2023815
Filing date: 2019-09-11
Publication date: 2019-10-22
Inventors: Yan Bing; Sun Huawen; Jin Wenliang; Huang Yuxin; Yao Shanshan; Yang Hua; Zhao Zhangyi; Hou Zhiqiang; Ouyang Qunan; Xie Lin; Duan Lili; Xia Fengyong
Applicant: Tianjin Research Inst Water Transp Engineering Mot
Main IPC class:

Description:

TECHNICAL FIELD [0001] The present invention relates to the field of fluid dynamics, and more particularly to a numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology.
BACKGROUND [0002] Computational Fluid Dynamics (CFD) is a discipline that uses computers and numerical methods to solve the equations of fluid dynamics in order to obtain flow laws and solve flow problems. It involves computational geometry, fluid dynamics, the mathematical theory of partial differential equations, numerical analysis, and so on. At present, CFD is developing towards high precision, large scale, multi-objective and real-time computation, and the demand for computation and storage is increasing day by day, so parallel computation on large parallel computers is inevitable. With the continuous improvement of the floating-point performance of the Graphics Processing Unit (GPU), a CPU/GPU heterogeneous architecture is often used in the construction of large parallel computers to improve performance. This brings opportunities for CFD applications, such as low cost and fast solution, and also brings algorithm design challenges for high performance computing researchers.
[0003] Unstructured grids are now widely used to establish hydrodynamic models. With the development of ocean hydrodynamic models, new requirements are placed on computational accuracy and efficiency. However, improving accuracy leads to a very large number of grid cells and a very large amount of computation. Without large-scale clusters it is difficult to obtain a computation result in a short time, so prediction requirements cannot be met. With the rapid development of graphics processing unit (GPU) performance, and the expansion and maturity of parallel programming languages that support GPU architectures, such as CUDA and OpenCL, a GPU-based parallel algorithm can effectively accelerate hydrodynamic models and complete numerical simulation of hydrodynamics in high-resolution marine environments.
[0004] At present, GPU parallel algorithm designs for two-dimensional hydrodynamic models can be divided into two categories in terms of how computational cells are mapped onto CUDA threads. The first uses a cell as the basic computational cell: the computational task on the cell is rewritten as a CUDA kernel function and mapped onto a CUDA thread. In this approach the flux on each cell face is computed repeatedly, but because the kernel runs concurrently and the repeated computations are performed concurrently and synchronously, there is no significant impact on efficiency. However, the discrete computation of the hydrodynamic model converts the area integral into a line integral along the faces of the control cell, and because the computational region has boundaries, numerical algorithms often need different flux computation methods for internal cell faces and boundary cell faces. A kernel that uses the cell as the basic computational cell therefore contains computational branches, which can greatly degrade performance. The second uses a cell face as the basic computational cell: the computational task on the cell face is rewritten as a CUDA kernel function and mapped onto a CUDA thread. The computation result of a common edge can be used by the two adjacent control cells, which reduces the computation by half. However, in such a parallel design multiple threads are prone to operate on the same data at the same time, and if the program is poorly designed, data operations by different threads will interfere with one another. It is therefore necessary to lock the data in order to make the operations on it atomic. In unstructured grid discrete computation, because of the irregularity of the grid numbering, a large number of atomic operations will exist, which degrades kernel performance.
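The trade-off between the two mappings described in [0004] can be illustrated with a minimal CPU-side sketch in Python. It is not the patented implementation: the four-cell mesh, the `face_flux` placeholder and the names `cell_based` and `face_based` are assumptions made only for illustration, and each loop iteration stands in for one GPU thread.

```python
# Internal faces of a tiny hypothetical mesh, stored as (cell_a, cell_b).
# The flux of a face is counted as leaving cell_a and entering cell_b.
FACES = [(0, 1), (1, 2), (2, 3), (0, 2)]
U = [1.0, 2.0, 4.0, 8.0]  # one scalar state per cell (illustrative data)


def face_flux(u_a, u_b):
    """Placeholder numerical flux (arithmetic mean), not the patent's formula."""
    return 0.5 * (u_a + u_b)


def cell_based(faces, u):
    """One 'thread' per cell: every cell recomputes the flux of its own faces."""
    faces_of_cell = [[] for _ in u]
    for a, b in faces:
        faces_of_cell[a].append((a, b, +1.0))  # flux leaves cell a
        faces_of_cell[b].append((a, b, -1.0))  # same flux enters cell b
    residual = [0.0] * len(u)
    evaluations = 0
    for cell, own_faces in enumerate(faces_of_cell):
        for a, b, sign in own_faces:
            residual[cell] += sign * face_flux(u[a], u[b])  # private write
            evaluations += 1
    return residual, evaluations


def face_based(faces, u):
    """One 'thread' per face: each flux is computed once and scattered twice."""
    residual = [0.0] * len(u)
    evaluations = 0
    for a, b in faces:
        flux = face_flux(u[a], u[b])
        evaluations += 1
        # On a GPU, these two writes can collide with other face threads
        # and would need an atomic add (for example CUDA's atomicAdd).
        residual[a] += flux
        residual[b] -= flux
    return residual, evaluations


residual_cell, evals_cell = cell_based(FACES, U)
residual_face, evals_face = face_based(FACES, U)
print(residual_cell, evals_cell)
print(residual_face, evals_face)
```

In this sketch both mappings give the same per-cell result; the cell-based loop performs twice as many flux evaluations but writes only to its own entry, while the face-based loop halves the evaluations at the price of scattered writes.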
[0005] In addition, existing GPU optimization for two-dimensional hydrodynamic models is performed on a single GPU, without considering optimization on distributed GPU clusters.
SUMMARY [0006] A purpose of the present invention is to solve at least the above problems and to provide at least the advantages that will be described later.
[0007] Another purpose of the present invention is to provide a numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, which overcomes the low computational efficiency caused by using either a grid cell or a grid cell face alone as the basic computational cell by separating the flux computation of internal grid cell faces and boundary grid cell faces, and which at the same time realizes the optimization of a two-dimensional hydrodynamic model on a distributed GPU cluster, further improving computation efficiency.
[0008] To achieve the above mentioned object and other advantages, the present invention adopts the following technical solution:
[0009] a numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, comprising the following steps of:
S1, sending grid information and physical field information on CPU to GPU memory by CPU;
S2, computing internal grid cell face flux and volume source item on the grid cell by a first kernel function and the pre-stored grid information and physical field information in the GPU and using a grid cell as a basic computational cell and corresponding to GPU thread;
S3, computing boundary grid cell face flux by a second kernel function and the pre-stored grid information and physical field information in the GPU and using grid cell face as a basic computational cell and corresponding to GPU thread;
S4, performing time advance computation using a third kernel function and the grid cell face flux obtained by S2 and S3 and using a cell as a basic computational cell and corresponding to GPU thread;
S5, returning computation result of S4 to the CPU by the GPU;
wherein multiple GPUs are used to perform parallel computation on the internal grid cell face flux, the boundary grid cell face flux, or the time advance in the S2, S3, and S4.
[00010] Preferably, in the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, the grid information comprises an internal grid cell face topology and geometric information of the internal grid cell face topology, and a boundary grid cell face topology and geometric information of the boundary grid cell face topology; and the physical field information comprises a dry and wet state of the cell and the cell face, a cell physical quantity and a boundary physical quantity.
[00011] Preferably, in the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, the first kernel function, the second kernel function, and the third kernel function are all CUDA kernel functions which are written in CUDA language and include flux, source item, boundary condition, and time advance.
[00012] Preferably, in the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, the flux includes internal grid cell face flux, boundary grid cell face flux, source item and time item advance.
[00013] Preferably, in the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, the internal grid cell face flux and the volume source item on the grid cell are computed by the first kernel function which uses a grid cell as a basic computational cell and corresponds to GPU thread, so as to realize the parallel computation of the flux computation on the internal grid cell face.
[00014] Preferably, in the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, the boundary grid cell face flux is computed by the second kernel function which uses a grid cell face as a basic computational cell and corresponds to GPU thread, so as to realize the parallel computation of the flux computation on the boundary grid cell face.
[00015] Preferably, in the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, before the internal grid cell face flux, the boundary grid cell face flux, and the time item advance are computed in parallel by multiple GPUs in S2, S3 and S4, the initial computational grid cells need to undergo grid cell region partition, ensuring that each grid cell obtained by the grid cell region partition includes only one grid cell boundary face;
wherein the grid cell region partition refers to partitioning the grid cells into corresponding sub-regions according to the number of GPUs, and the specific partition method includes the steps of:
S1-1, converting a grid file to a graph file;
S1-2, invoking the graph partition tools pmetis and kmetis in the graph partition software package Metis, and performing the grid cell region partition of the graph file.
[00016] Preferably, in the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, each of the GPUs is used to compute one of the sub-regions, and the GPU computes a dry-wet boundary process, a boundary computation, a convection term, a diffusion term, and a source item of the sub-region by a kernel function.
[00017] Preferably, in the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, asynchronous communication technology is used between the GPUs to realize data communication and computation overlap, and the specific method includes the following steps of:
S2-1, copying parallel grid cell boundary face flow rate that needs to communicate from the GPU to the CPU memory before computation by the kernel function;
S2-2, communicating the parallel grid cell boundary face flow rate using MPI non-blocking communication by the CPU, and computing internal grid cell face flux independent of the parallel boundary by the kernel function;
S2-3, performing discrete computations of the parallel grid cell boundary face by the kernel function based on the parallel grid cell boundary face flow rate received and uploaded to the GPU.
[00018] The present invention comprises at least the following beneficial effects compared to the related art:
In the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, the internal grid cell face flux and the volume source item on the grid cell are computed by the first kernel function, and the boundary grid cell face flux is computed by the second kernel function. This separates the computation of the internal cell faces from that of the boundary cell faces, so that the computation efficiency is greatly improved compared with existing computation based on either a cell or a cell face as the basic computational cell.
The internal grid cell face flux and the boundary grid cell face flux are computed in parallel using multiple GPUs, realizing optimization of the two-dimensional hydrodynamic model on the distributed GPU cluster, and computation efficiency is further improved.
By performing the numerical computations on the GPU, no data transfer between the GPU and the CPU is required during the computation process, which provides a prerequisite for efficient parallelism.
BRIEF DESCRIPTION OF THE DRAWINGS [00019] Fig. 1 is a flow diagram of the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology according to the present invention.
[00020] Fig. 2 is a schematic diagram of grid partition according to the present invention.
[00021] Fig. 3 is a flow diagram of using asynchronous communication technology between GPUs to realize data communication and computation overlap.
[00022] Fig. 4 is a schematic diagram of a computation range and a water depth according to one embodiment of the present invention.
[00023] Fig. 5 is an overall grid diagram of the computation range according to one embodiment of the present invention.
[00024] Fig. 6 is a partial grid diagram of the computation range according to one embodiment of the present invention.
[00025] Fig. 7 is a schematic diagram of a monitoring point according to one embodiment of the present invention.
[00026] Fig. 8 is a comparison diagram of tidal level of monitoring point 1 according to an embodiment of the present invention.
[00027] Fig. 9 is a comparison diagram of tidal level of monitoring point 2 according to an embodiment of the present invention.
[00028] Fig. 10 is a comparison diagram of tidal level of monitoring point 3 according to an embodiment of the present invention.
[00029] Fig. 11 is a comparison diagram of tidal level of monitoring point 4 according to an embodiment of the present invention.
DETAILED DESCRIPTION OF EMBODIMENTS [00030] The present invention will be described in further detail below with reference to the accompanying drawings, in order to enable a person skilled in the art to practice it with reference to the description.
[00031] As shown in Fig. 1, a numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology comprises the following steps:
S1, sending grid information and physical field information on CPU to GPU memory by CPU;
S2, computing internal grid cell face flux and the volume source item on the grid cell by a first kernel function and the pre-stored grid information and physical field information in the GPU and using a grid cell as a basic computational cell and corresponding to GPU thread;
S3, computing boundary grid cell face flux by a second kernel function and the pre-stored grid information and physical field information in the GPU and using grid cell face as a basic computational cell and corresponding to GPU thread;
S4, performing time advance computation using a third kernel function and the grid cell face flux obtained by S2 and S3 and using a cell as a basic computational cell and corresponding to GPU thread;
S5, returning computation result of S4 to the CPU by the GPU;
wherein multiple GPUs are used to perform parallel computation on the internal grid cell face flux, the boundary grid cell face flux, or the time advance in the S2, S3, and S4.
[00032] In the above solution, the internal grid cell face flux and the volume source item on the grid cell are computed by the first kernel function, and the boundary grid cell face flux is computed by the second kernel function. This separates the computation of the internal cell faces from that of the boundary cell faces, so that the computation efficiency is greatly improved compared with existing computation based on either a cell or a cell face as the basic computational cell.
[00033] The internal grid cell face flux and the boundary grid cell face flux are computed in parallel using multiple GPUs, realizing the optimization of the two-dimensional hydrodynamic model on the distributed GPU cluster, and computation efficiency is further improved.
[00034] Since the GPU and the CPU have independent physical memory, the cudaMemcpy function must be called to implement data interaction. The data interaction speed is limited by bandwidth, which often becomes a bottleneck for GPU program acceleration. Therefore, by performing numerical computation on the GPU, data transfer between the GPU and the CPU is not required in the computation process, which provides a prerequisite for efficient parallelism.
[00035] In a preferred solution, the grid information comprises an internal grid cell face topology and geometric information of the internal grid cell face topology, and a boundary grid cell face topology and geometric information of the boundary grid cell face topology; and the physical field information comprises a dry and wet state of the cell and the cell face, a cell physical quantity and a boundary physical quantity.
[00036] In a preferred solution, the first kernel function, the second kernel function, and the third kernel function are all CUDA kernel functions which are written in CUDA language and include flux, source item, boundary condition, and time advance.
[00037] In the above solution, the kernel function (kernel) is the basic unit in CUDA, and many threads are started on the GPU to execute it concurrently according to stream processor capability. In the numerical computation method, the computations of flux, source item, boundary conditions and time item advance are written as kernel functions in the CUDA language for a single GPU, and are transferred to the GPU for computation to improve computational efficiency.
[00038] In a preferred solution, the flux includes internal grid cell face flux, boundary grid cell face flux, source item and time item advance.
[00039] In a preferred solution, the internal grid cell face flux and the volume source item on the grid cell are computed by the first kernel function which uses a grid cell as a basic computational cell and corresponds to GPU thread, so as to realize the parallel computation of the flux computation on the internal grid cell face.
[00040] In the above solution, the basic computational cell in the computation corresponds to the CUDA thread. The computation of internal cell flux mainly involves the computation of internal cell face flux, source item on the cell and time item advance. When the kernel is concurrent, the cell is used as the basic computational cell and corresponds with the thread. The internal cell face flux and the volume source item on the cell are computed, so that the concurrency of the flux computation on the cell can be achieved.
[00041] In a preferred solution, the boundary grid cell face flux is computed by the second kernel function which uses a grid cell face as a basic computational cell and corresponds to GPU thread, so as to realize the parallel computation of the flux computation on the boundary grid cell face.
[00042] In the above solution, the computation process for a boundary cell face is often different from that for an internal cell face. Therefore, the second kernel is used to process boundary faces. Since the boundary cell face computation simply adds a flux to the corresponding boundary cell, the cell face is used as the basic computational cell, computing the face flux and corresponding to a CUDA thread, which realizes concurrency of the boundary cell face flux computation.
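The division of work among the three kernels can be sketched on the CPU as follows. This is a hedged illustration in Python rather than CUDA, and everything specific in it is an assumption: the three-cell chain, the placeholder flux `K * (u_c - u_n)`, the explicit Euler update, and the function names. Each loop iteration stands in for one GPU thread, and each cell owns at most one boundary face, so the boundary loop never writes to the same cell twice.

```python
K = 1.0  # hypothetical exchange coefficient for the placeholder flux


def kernel_internal(neighbours, u, source):
    """'First kernel': one thread per cell, internal face flux plus source."""
    residual = []
    for cell, nbrs in enumerate(neighbours):
        out_flux = sum(K * (u[cell] - u[n]) for n in nbrs)
        residual.append(out_flux - source[cell])
    return residual


def kernel_boundary(boundary_faces, u, residual):
    """'Second kernel': one thread per boundary face.

    Each boundary face belongs to exactly one cell, and in this sketch each
    cell has at most one boundary face, so the writes do not conflict.
    """
    for cell, boundary_value in boundary_faces:
        residual[cell] += K * (u[cell] - boundary_value)


def kernel_time_advance(u, residual, dt, area):
    """'Third kernel': one thread per cell, explicit Euler update (assumed)."""
    return [u[c] - dt * residual[c] / area[c] for c in range(len(u))]


def run(steps):
    neighbours = [[1], [0, 2], [1]]        # chain of three cells
    boundary_faces = [(0, 4.0), (2, 0.0)]  # (owning cell, boundary value)
    area = [1.0, 1.0, 1.0]
    source = [0.0, 0.0, 0.0]
    u = [0.0, 0.0, 0.0]
    for _ in range(steps):
        residual = kernel_internal(neighbours, u, source)
        kernel_boundary(boundary_faces, u, residual)
        u = kernel_time_advance(u, residual, 0.25, area)
    return u


print(run(2))
```

Keeping the boundary treatment in its own loop means the per-cell loop contains no internal-versus-boundary branch, which is the point the paragraph above makes about the second kernel.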
[00043] In a preferred solution, before the internal grid cell face flux, the boundary grid cell face flux, and the time item advance are computed in parallel by multiple GPUs in S2, S3 and S4, the initial computational grid cells need to undergo grid cell region partition, ensuring that each grid cell obtained by the grid cell region partition includes only one grid cell boundary face; wherein the grid cell region partition refers to partitioning the grid cells into corresponding sub-regions according to the number of GPUs, and the specific partition method includes the steps of:
S1-1, converting a grid file to a graph file;
S1-2, invoking the graph partition tools pmetis and kmetis in the graph partition software package Metis, and performing the grid cell region partition of the graph file.
[00044] In the above solution, the grid needs to be divided into corresponding sub-regions according to the number of nodes. The grid file is converted into a graph file under the control of related parameters, and the partition tools pmetis and kmetis provided by Metis are then used to perform the grid partition, so that region partition of arbitrarily shaped grids and hybrid grids can be realized. For example, for the unstructured grid shown in Fig. 2, cell numbers begin with the letter c, and interface numbers are plain numbers. An internal interface is adjacent to two cells, so the cell interface can be converted into an edge in the graph file, and the two cell numbers become the two nodes of the edge. For instance, the cells c1 and c6 on either side of interface 6 of the unstructured grid correspond to the edge 6 and the nodes c1 and c6 in the graph data in the figure. After all the grid boundary faces and cells are converted, the graph data shown in the figure is obtained. For example, the cell c1 in the grid file is adjacent to the cells c6, c5, and c2; after conversion, the node c1 is adjacent to the nodes c6, c5, and c2. Once the graph data is obtained, the Metis graph partition tool can be called to perform the partition, which finally yields the ordinal number of the region to which each node in the graph data belongs. By mapping these ordinal numbers back onto the unstructured grid, the ordinal number of the partition region of each computational cell in the original unstructured grid is obtained. The grid topological information will be used for the discrete solution of the subsequent hydrodynamic model.
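The face-to-edge conversion described in [00044] can be sketched in Python as follows. Only the neighbours of c1 (c6, c5 and c2) come from the text; the remaining faces of the six-cell mesh, the partition vector `PART` and all function names are hypothetical. The text layout follows the plain graph file format described in the METIS manual (a header with the vertex and edge counts, then one line of 1-based neighbours per vertex), and the partitioner itself is not called here.

```python
def faces_to_adjacency(num_cells, internal_faces):
    """Each internal face (a, b) becomes one undirected edge between cells."""
    adjacency = [[] for _ in range(num_cells)]
    for a, b in internal_faces:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return [sorted(nbrs) for nbrs in adjacency]


def to_graph_file_text(adjacency):
    """Plain graph file: 'vertices edges', then 1-based neighbour lists."""
    num_edges = sum(len(nbrs) for nbrs in adjacency) // 2
    lines = [f"{len(adjacency)} {num_edges}"]
    for nbrs in adjacency:
        lines.append(" ".join(str(n + 1) for n in nbrs))
    return "\n".join(lines) + "\n"


def to_csr(adjacency):
    """Compressed adjacency arrays, the form graph partition libraries take."""
    xadj, adjncy = [0], []
    for nbrs in adjacency:
        adjncy.extend(nbrs)
        xadj.append(len(adjncy))
    return xadj, adjncy


def parallel_boundary_faces(internal_faces, part):
    """Faces whose two cells were assigned to different sub-regions."""
    return [(a, b) for a, b in internal_faces if part[a] != part[b]]


# Cells c1..c6 are stored 0-based. c1 touches c6, c5 and c2 as in the text;
# the other faces are made up for this sketch.
INTERNAL_FACES = [(0, 5), (0, 4), (0, 1), (1, 2), (2, 3), (3, 4)]
ADJACENCY = faces_to_adjacency(6, INTERNAL_FACES)
GRAPH_TEXT = to_graph_file_text(ADJACENCY)
XADJ, ADJNCY = to_csr(ADJACENCY)

# Hypothetical result of a two-way partition, one region number per cell.
PART = [0, 0, 0, 1, 1, 0]
CUT_FACES = parallel_boundary_faces(INTERNAL_FACES, PART)
print(GRAPH_TEXT)
print(CUT_FACES)
```

Mapping the partition back onto the grid is then an index lookup (`PART[cell]`), and the faces returned by `parallel_boundary_faces` are the ones that would need data from a neighbouring sub-region.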
[00045] As shown in Fig. 3, in a preferred solution, asynchronous communication technology is used between the GPUs to realize data communication and computation overlap, and the specific method includes the following steps of:
S2-1, copying parallel grid cell boundary face flow rate that needs to communicate from the GPU to the CPU memory before computation by the kernel function;
S2-2, communicating the parallel grid cell boundary face flow rate using MPI non-blocking communication by the CPU, and calculating internal grid cell face flux independent of the parallel boundary by the kernel function;
S2-3, performing discrete computations of the parallel grid cell boundary face by the kernel function based on the parallel grid cell boundary face flow rate received and uploaded to the GPU.
[00046] In the above solution, in the distributed system environment, because the GPU memory between nodes cannot communicate directly, special design is needed to ensure computational load balance and data communication efficiency between nodes. Computation efficiency of the parallel computation in a distributed system environment depends on the proportion of communication time in the total computation time. Therefore, the asynchronous communication technology is used to realize the data communication and computation overlap between nodes, and the data exchange between the parallel boundary face and the discrete computation of the internal cell face are concurrently performed, thereby achieving the purpose of concealing the communication time of data exchange.
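The ordering of steps S2-1 to S2-3 can be sketched with a single-process stand-in. This Python example does not use MPI or a GPU: a worker thread from `concurrent.futures` plays the role of the non-blocking exchange (what calls such as `MPI_Isend`, `MPI_Irecv` and `MPI_Waitall` would do in a real code), and `fake_exchange`, `overlapped_step`, the placeholder flux and the data are all assumptions made for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def flux(u_a, u_b):
    """Placeholder face flux (simple difference), not the patent's formula."""
    return u_a - u_b


def fake_exchange(send_buffer, remote_values):
    """Stand-in for the network transfer; a real code would ship send_buffer
    to the neighbouring rank and receive that rank's boundary values."""
    time.sleep(0.01)  # pretend latency that the interior work can hide
    return list(remote_values)


def overlapped_step(u_local, internal_faces, parallel_faces, remote_values):
    residual = [0.0] * len(u_local)
    with ThreadPoolExecutor(max_workers=1) as pool:
        # S2-1: gather the parallel boundary values into a host-side buffer.
        send_buffer = [u_local[cell] for cell, _ in parallel_faces]
        # S2-2 (communication): post the exchange and return immediately.
        pending = pool.submit(fake_exchange, send_buffer, remote_values)
        # S2-2 (computation): internal faces do not need the halo values.
        for a, b in internal_faces:
            f = flux(u_local[a], u_local[b])
            residual[a] += f
            residual[b] -= f
        # Block until the exchange has completed.
        halo = pending.result()
        # S2-3: parallel boundary faces use the received values.
        for cell, slot in parallel_faces:
            residual[cell] += flux(u_local[cell], halo[slot])
    return residual


# Two local cells, one internal face, one parallel boundary face on cell 1.
result = overlapped_step([3.0, 1.0], [(0, 1)], [(1, 0)], [0.5])
print(result)
```

The point of the ordering is that the internal-face loop runs while the exchange is in flight, so the communication time is hidden whenever the interior work takes at least as long as the transfer.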
[00047] Embodiment
[00048] Example test: simulation of tides and tidal currents in the Bohai Sea
[00049] 1, Example description
[00050] Taking the line connecting Yantai and Dalian as the boundary, the tidal wave propagation and tidal current movement within the Bohai Sea are computed. The computation range and terrain are shown in Fig. 4. The number of grid nodes is 60307, the number of cells is 117142, the maximum grid spacing is 9758 m, and the minimum grid spacing is 40 m, as shown in Fig. 5 and Fig. 6.
[00051] The computational efficiency and results of the CUDA parallel program are tested and analyzed using the numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology of the present invention. The test is mainly divided into two parts:
[00052] (1) comparing the computation results of the GPU with those of the CPU and other software;
[00053] (2) testing the computational efficiency of the GPU.
[00054] 2, CPU test environment
[00055] (1) computing platform:
[00056] Power Leader PR8800G eight-way parallel computer
[00057] Eight Intel Xeon Processor E7-8867 v3 (2.5GHz/16C/45M/165W/9.6G)
[00058] 24X16GB/DDR4/2133MHz/ECC/REG/2RANK
[00059] 5X 900GB/SAS/10000RPM/2.5 inch/enterprise level
[00060] (2) operating system
[00061] Red Hat Enterprise Linux Server release 7.2 (Linux version 3.10.0-327.el7.x86_64)
[00062] 3, GPU test environment
[00063] TH-1A system, GPU test computing partition, Tesla M2050 and Tesla K20m graphics cards.
[00064] 4, computation result
[00065] As shown in Fig. 7, monitoring points 1 to 4 are selected within the computation range. Fig. 8 to Fig. 11 compare the tidal level at the different points computed by the GPU parallel program with the results of the commercial hydrodynamic computation software MIKE and of the CPU computation. The tidal level results computed by the GPU parallel program are consistent with those computed by MIKE and by the CPU version of the program, indicating that the GPU computation result of the program is accurate.
[00066] Computing time statistics of GPU and CPU and other software are shown in Table 1 below.
[00067] Table 1 computing time statistics of GPU and CPU and other software
Computing platform | Computing time (seconds)
32 cores (Intel Xeon 5670 CPU) | 576
40 cores (Intel Xeon E7-8867 v3 CPU) | 240
1 GPU (Intel Xeon 5670 + Tesla M2050) | 717
1 GPU (Intel Xeon 5670 + Tesla K20m) | 578
[00068] As can be seen from Table 1, the computation efficiency of a single Tesla K20m (578 seconds) is basically equivalent to that of 32 cores of the Intel Xeon 5670 CPU (576 seconds).
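The comparison can be made explicit with a few ratios computed from the times in Table 1. The short Python sketch below only restates that arithmetic; the dictionary keys and the helper name `relative_speed` are illustrative labels, not part of the original test.

```python
# Computing times in seconds, as listed in Table 1.
TIMINGS = {
    "32 cores (Xeon 5670)": 576,
    "40 cores (Xeon E7-8867 v3)": 240,
    "1 GPU (Tesla M2050)": 717,
    "1 GPU (Tesla K20m)": 578,
}


def relative_speed(reference, candidate):
    """How many times faster the candidate is than the reference run."""
    return TIMINGS[reference] / TIMINGS[candidate]


k20m_vs_32_cores = relative_speed("32 cores (Xeon 5670)", "1 GPU (Tesla K20m)")
k20m_vs_m2050 = relative_speed("1 GPU (Tesla M2050)", "1 GPU (Tesla K20m)")
e7_vs_k20m = relative_speed("1 GPU (Tesla K20m)", "40 cores (Xeon E7-8867 v3)")

print(round(k20m_vs_32_cores, 3), round(k20m_vs_m2050, 2), round(e7_vs_k20m, 2))
```

On these numbers the K20m run is within about half a percent of the 32-core run and about 1.24 times faster than the M2050 run, while the 40-core E7-8867 v3 run is about 2.4 times faster than the K20m run.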
[00069] Although embodiments of the present invention have been disclosed above, they are not limited to the applications listed in the specification and embodiments, and can be applied to all kinds of fields suitable for the present invention. Additional modifications can be easily implemented by those familiar with the field. Therefore, without departing from the general concepts defined in the claims and their equivalents, the present invention is not limited to the specific details and the illustrations shown and described herein.

Claims (9):
[1]
A numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology, comprising the following steps:
S1: sending grid information and physical field information on the CPU to GPU memory by the CPU;
S2: computing the internal grid cell face flux and the volume source item on a grid cell by means of a first kernel function and the grid information and physical field information pre-stored in the GPU, using the grid cell as a basic computational cell corresponding to a GPU thread;
S3: computing the boundary grid cell face flux by means of a second kernel function and the pre-stored grid information and physical field information in the GPU, using a grid cell face as a basic computational cell corresponding to a GPU thread;
S4: performing the time advance computation by means of a third kernel function and the grid cell face flux obtained in S2 and S3, using a cell as a basic computational cell corresponding to a GPU thread;
S5: returning the computation result of S4 to the CPU by the GPU;
wherein multiple GPUs are used to perform parallel computation on the internal grid cell face flux, the boundary grid cell face flux, or the time advance in S2, S3, and S4.
[2]
The numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology according to claim 1, characterized in that the grid information comprises an internal grid cell face topology and geometric information of the internal grid cell face topology, and a boundary grid cell face topology and geometric information of the boundary grid cell face topology; and the physical field information comprises a dry and wet state of the cell and the cell face.
[3]
The numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology according to claim 1, characterized in that the first kernel function, the second kernel function and the third kernel function are all CUDA kernel functions which are written in the CUDA language and include flux, source item, boundary condition and time advance.
[4]
The numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology according to claim 1, characterized in that the flux comprises the internal grid cell face flux, the boundary grid cell face flux, the source item and the time item advance.
[5]
The numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology according to claim 1, characterized in that the internal grid cell face flux and the volume source item on the grid cell are computed by the first kernel function, which uses a grid cell as a basic computational cell and corresponds to a GPU thread, so as to realize parallel computation of the flux on the internal grid cell faces.
[6]
The numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology according to claim 1, characterized in that the boundary grid cell face flux is computed by the second kernel function, which uses a grid cell face as a basic computational cell and corresponds to a GPU thread, so as to realize parallel computation of the flux on the boundary grid cell faces.
[7]
The numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology according to claim 1, characterized in that before the internal grid cell face flux, the boundary grid cell face flux and the time item advance are computed in parallel by multiple GPUs in S2, S3 and S4, the initial computational grid cells need to undergo grid cell region partition, ensuring that each grid cell obtained by the grid cell region partition comprises only one grid cell boundary face; wherein the grid cell region partition refers to partitioning the grid cells into corresponding sub-regions according to the number of GPUs, and the specific partition method comprises the steps of:
S1-1: converting a grid file to a graph file;
S1-2: invoking the graph partition tools pmetis and kmetis in the graph partition software package Metis to perform the grid cell region partition of the graph file.
[8]
The numerical simulation method for unstructured grid tides and tidal currents based on GPU computation technology according to claim 7, characterized in that each of the GPUs is used to compute one of the sub-regions, and the GPU is used to compute, by a kernel function, a dry-wet boundary process, a boundary computation, a convection term, a diffusion term and a source item of the sub-region.
[9]
Method for numerical simulation with unstructured grid of tides and tidal currents based on GPU calculation technology according to claim 1, characterized in that asynchronous communication technology between GPUs is used to overlap data communication with calculation, and the specific method comprises the following steps:
S2-1: copying the flow velocity on the interfaces of the parallel grid cells that is to be communicated from the GPU to the CPU memory, before calculation by the kernel function;
S2-2: communicating the flow velocity on the interfaces of the parallel grid cells by the CPU using non-blocking MPI communication, while the kernel function calculates the flux on the internal grid cell faces that are independent of the parallel boundary;
S2-3: performing the discrete calculation on the interfaces of the parallel grid cells by the kernel function, based on the flow velocity on the interfaces of the parallel grid cells that has been received and uploaded to the GPU.
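By way of illustration only, the ordering of steps S2-1 to S2-3 can be sketched in Python with a background thread standing in for the non-blocking MPI exchange. Real code would typically post `MPI_Isend`/`MPI_Irecv` and wait on the requests. The function `exchange_halo`, the arithmetic, and all other names are assumptions made for this sketch.

```python
# Illustrative sketch of communication/computation overlap.
# A worker thread stands in for non-blocking MPI; all names are hypothetical.
from concurrent.futures import ThreadPoolExecutor


def exchange_halo(send_values):
    """Stand-in for a non-blocking send/receive pair with a neighbouring rank.

    The fictitious neighbour simply answers with each value plus 10.
    """
    return [v + 10.0 for v in send_values]


def step(u, interface_ids):
    interface_set = set(interface_ids)
    # S2-1: stage the interface velocities in host memory (a plain copy here).
    host_buf = [u[i] for i in interface_ids]
    with ThreadPoolExecutor(max_workers=1) as pool:
        # S2-2: start the exchange, then do the interior work while it runs.
        pending = pool.submit(exchange_halo, host_buf)
        interior = {i: 2.0 * u[i]
                    for i in range(len(u)) if i not in interface_set}
        received = pending.result()  # analogous to waiting on the MPI requests
    # S2-3: interface cells are processed once the neighbour data has arrived.
    boundary = {i: u[i] + r for i, r in zip(interface_ids, received)}
    merged = {**interior, **boundary}
    return [merged[i] for i in range(len(u))]


u = [1.0, 2.0, 3.0, 4.0]
out = step(u, interface_ids=[3])
print(out)
```

The point of the ordering is that the interior work does not depend on the neighbour's data, so it can proceed while the exchange is in flight. Only the interface work has to wait.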

[Figure labels: Unstructured grid; Graph data; Graph partition]

Similar technologies:

Publication No. | Publication Date | Patent Title

Lacasta et al.2014|An optimized GPU implementation of a 2D free surface simulation model on unstructured meshes

Zhong et al.2014|Data partitioning on multicore and multi-GPU platforms using functional performance models

Zhou et al.2016|Optimization of parallel iterated local search algorithms on graphics processing unit

Zhang et al.2014|Parallel computation of a dam-break flow model using OpenMP on a multi-core computer

Khaleghzadeh et al.2018|A novel data-partitioning algorithm for performance optimization of data-parallel applications on heterogeneous HPC platforms

JP2018190450A|2018-11-29|Efficient determination of join paths via cardinality estimation

Ayala et al.2014|DNS of hydrodynamically interacting droplets in turbulent clouds: Parallel implementation and scalability analysis using 2D domain decomposition

NL2023815B1|2020-08-19|Numerical simulation method for unstructured grid tides and tidal currents based on gpu computation technology

Cuomo et al.2017|A parallel pde-based numerical algorithm for computing the optical flow in hybrid systems

Wang et al.2012|DDDAS-based parallel simulation of threat management for urban water distribution systems

Zhang et al.2017|Parallel computation of a dam-break flow model using OpenACC applications

Holmes et al.2011|A framework for parallel computational physics algorithms on multi-core: SPH in parallel

Ma et al.2015|GPU parallelization of unstructured/hybrid grid ALE multigrid unsteady solver for moving body problems

Wu et al.2011|Parallel artificial neural network using CUDA-enabled GPU for extracting hydraulic domain knowledge of large water distribution systems

Loring et al.2020|Improving performance of m-to-n processing and data redistribution in in transit analysis and visualization

Deng et al.2016|CPU/GPU computing for an implicit multi-block compressible Navier-Stokes solver on heterogeneous platform

Lastovetsky et al.2018|How pre-multicore methods and algorithms perform in multicore era

Ameli et al.2014|Development of an efficient and flexible pipeline for Lagrangian coherent structure computation

Zhang et al.2016|Implementation and efficiency analysis of parallel computation using OpenACC: a case study using flow field simulations

Biswas et al.1996|Global load balancing with parallel mesh adaption on distributed-memory systems

Stojanovic et al.2015|A hybrid MPI+ OpenMP application for processing big trajectory data

Gidra et al.2011|Parallelizing TUNAMI-N1 Using GPGPU

Zagidullin et al.2019|Supercomputer Modelling of Spatially-heterogeneous Coagulation using MPI and CUDA

Degtyarev et al.2019|Virtual testbed: ship motion simulation for personal workstations

Krishnamurthy et al.2014|Parallel MATLAB Techniques

Patent family:

Publication No. | Publication Date

CN112035995A|2020-12-04|

NL2023815B1|2020-08-19|

Cited documents:

Publication No. | Filing Date | Publication Date | Applicant | Patent Title

CN113706706B|2021-10-28|2022-02-01|自然资源部第一海洋研究所|Data processing method and device and electronic equipment|

Legal status:

Priority:

Application No. | Filing Date | Patent Title

CN201910654061.2A|CN112035995A|2019-07-19|2019-07-19|Nonstructural grid tidal current numerical simulation method based on GPU computing technology|
